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We consider forecasting the latent rate profiles of a time series 
of inhomogeneous Poisson processes. The work is motivated by op- 
erations management of queueing systems, in particular, telephone 
call centers, where accurate forecasting of call arrival rates is a cru- 
cial primitive for efficient staffing of such centers. Our forecasting 
PLh , approach utilizes dimension reduction through a factor analysis of 

.^^ • Poisson variables, followed by time series modeling of factor score 

• ' series. Time series forecasts of factor scores are combined with factor 

j^ , loadings to yield forecasts of future Poisson rate profiles. Penalized 

Poisson regressions on factor loadings guided by time series forecasts 
of factor scores are used to generate dynamic within-process rate up- 
dating. Methods are also developed to obtain distributional forecasts. 
Our methods are illustrated using simulation and real data. The em- 
pirical results demonstrate how forecasting and dynamic updating of 
call arrival rates can affect the accuracy of call center staffing. 



1. Introduction. Queueing models usually assume that arrivals to a queue- 
ing system follow an inhomogeneous Poisson process. For efficient manage- 



QQ ' ment of such a system, accurate prediction of the underlying random rate 

^D ■ function of the inhomogeneous Poisson process is of primary importance. 

For example, various staffing models need the forecasted rate as one es- 
sential input [Cans, Koole and Mandelbaum (2003)]. Both the latency and 
uncertainty of the rate makes such a prediction problem interesting and dif- 



5^ I ficult. The current paper proposes a methodology for forecasting the arrival 
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2 H. SHEN AND J. Z. HUANG 

rate functions of a time series of inhomogeneous Poisson processes. The pro- 
posal combines the Poisson process assumption with ideas of data-driven 
dimension reduction as well as time series forecasting. 

Our primary motivation and application come from call centers, where 
customers contact their service providers through telephones and wait in 
tele queues. The daily call arrivals to a call center form a point process that 
can be modeled as an inhomogeneous Poisson process [Brown et al. (2005)]. 
Motivated by call center operations management, two forecasting problems 
will be addressed: to forecast future rate functions days or weeks ahead, 
and to update existing forecasts using additional information during a day. 
Throughout the paper, we will use call centers to facilitate our presenta- 
tion; however, the developed methodology undoubtedly has a much wider 
application spectrum. 

In this paper a daily call arrival process is modeled as a Poisson process 
with a stochastic intensity. In other words, a call arrival process is a Cox 
process [Cox (1955)], also known as a doubly stochastic Poisson process. 
However, instead of focusing on a single daily arrival process, we consider a 
time series of daily arrival processes. What makes our paper different from 
the literature of Cox processes is the time series dependency of the daily 
intensity functions. Such dependency makes it feasible to perform time series 
forecasts of the intensity functions using historical data. 

Call centers have become a primary contact point between companies and 
their customers in modern business [Mandelbaum (2006), 
Aksin, Armony and Mehrotra (2007)]. The forecasting of future call arrival 
rates is critical for efficient agent staffing and scheduling, but the current 
practice is still in its infancy [Cans, Koole and Mandelbaum (2003)]. To 
close the gap, recently more research efforts have been devoted to call cen- 
ter forecasting. Among them, Weinberg, Brown and Stroud (2007) propose 
a model-based approach where a two-way multiplicative Bayesian model for 
inhomogeneous Poisson processes is used to predict the Poisson rate func- 
tions; while the data-driven approach of Shen and Huang (2008) combines 
dimension reduction through singular value decomposition (SVD) with time 
series forecasting, and provides a more robust solution to the problem that 
does not rely on the multiplicative model assumption. To the best of our 
knowledge, these are the only two papers that have dealt with the problem 
of dynamic intraday updating. 

However, instead of modeling the arrival counts directly, both approaches 
model the square-root transformed counts, which are approximately nor- 
mally distributed for Poisson counts [Brown et al. (2007)]. The data-driven 
approach in Shen and Huang (2008) does not make use of the Poisson as- 
sumption, and aims at forecasting future counts. To the extent that some 
stochastic model assumption is reasonable for the counts, such as inhomoge- 
neous Poisson processes, incorporating it appropriately should improve the 
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statistical efficiency of the metliod. Note that forecasting the latent arrival 
rates of the Poisson processes is of primary interest from the management 
point of view (see Section 4.1). Although the point forecasts for counts 
and rates are the same, their distributional forecasts are different. Fur- 
thermore, neither Weinberg, Brown and Stroud (2007) nor Shen and Huang 
(2008) has considered the managerial consequences of different forecasts of 
the rates of the Poisson processes. 

The purpose of this paper is to develop a forecasting method that utilizes 
the widely used Poisson assumption in queueing models. More precisely, 
we consider a time series of inhomogeneous Poisson processes and use the 
historical data to forecast the future rate functions of the Poisson processes. 
In the call center application, for example, each Poisson process models the 
customer call arrivals during one day in a sequence of days. Since the rate 
function of each Poisson process is approximately constant during short time 
intervals (such as quarter hours in our data example in Section 4), the data 
for a given day can be aggregated into a vector of Poisson random variables 
with varying rates, counting the call arrivals during these intervals. The 
collection of such rates for a given day is referred as the rate profile and the 
corresponding counts form the count profile. The unobservable rate profiles 
form a vector time series, and our goal is to forecast future rate profiles 
using the observed historical count profiles. 

The dimensionality of the vector time series of count profiles is usually 
high. Thus, we propose to first reduce dimension by building a factor model 
on the hidden rate profiles. The proposed dimension reduction technique 
can be viewed as an extension of singular value decomposition to Poisson 
data, or Poisson SVD. According to the factor model, every rate profile can 
be expressed as a linear combination of a few factors. Factor loadings and 
scores can be computed by fitting generalized linear models (GLM) [Dobson 
(2001)]. An alternating maximum likelihood algorithm is employed that fits 
marginal Poisson regression models by alternately fixing the factor loadings 
and scores. The algorithm is closely related to the alternating least squares 
algorithm for computing singular value decomposition [Gabriel and Zamir 
(1979)]. 

After the factor analysis, forecasting future rate profiles reduces to fore- 
casting time series of factor scores. Univariate time series models can be 
built for each factor score series to produce time series forecasts of the rate 
profile. To achieve within-process dynamic rate updating, we propose to ef- 
fectively combine the information contained in both the historical processes 
and the current process, through maximizing a penalized likelihood crite- 
rion. The criterion appropriately balances two measures, the goodness-of-fit 
of the factor model to the early segment of the current process and the devi- 
ation of the factor scores from their forecasts provided by the interday time 
series forecasting. 
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Our forecasting approach enjoys the benefit of both the model-driven ap- 
proach of Weinberg, Brown and Stroud (2007) and the data-driven approach 
of Shen and Huang (2008). It directly models and forecasts the rate profiles 
of inhomogeneous Poisson processes. [Some earlier analysis is briefly doc- 
umented in Shen, Huang and Lee (2007).] Making no rigid model assump- 
tions on the rate profiles, our approach is robust, as shown in two simulation 
studies. In this paper we also illustrate how rate forecasting and dynamic 
updating benefits the staffing decision in a real call center application, which 
has not been considered before in the literature. 

The data we analyze and use for building forecasting models have an inter- 
esting two-way structure: there are both day-to-day variation/dependence 
and within-day variation/dependence. This specific structure differentiates 
our work with the related literature on longitudinal data analysis and dy- 
namic factor models, where only one-way structure is present. Longitudinal 
data analysis [Diggle et al. (2002)] concerns repeated measurements from 
many subjects. While there are typically within subject correlation, the mea- 
surements from different subjects are usually independent. Dynamic factor 
models are commonly used in econometrics for analyzing economic time 
series [Stock and Watson (2005)]. Since economic variables do not have a 
natural ordering as the one present in our within-day call volume profiles, 
intraday dynamic updating is not of concern in the dynamic factor models 
literature. Moreover, we are unaware of any work on dynamic factor models 
for Poisson count data. 

The rest of the paper is structured as follows. Section 2 describes the factor 
model as well as the alternating maximum likelihood estimation algorithm 
for model estimation. Our forecasting approach is described in detail in 
Section 3. The proposed method is applied in Section 4 to the call center 
arrival data analyzed by Shen and Huang (2008). Some background on call 
center staffing is briefly discussed in Section 4.1. Section 4.2 fits the factor 
model to the real data. Two simulation studies are presented in Section 4.3.2 
to illustrate the robustness of our method and its performance on forecasting 
hidden rate profiles. The effect of rate forecasting on staffing level is discussed 
in Section 4.3.3. In Section 4.4 several intraday rate updating methods are 
used to update the related staffing levels. We conclude in Section 5 with 
some discussion. 

2. Dimension reduction of Poisson variables. 

2.1. A Poisson factor model. Consider a time series of inhomogeneous 
Poisson processes that are observed over the same time interval. Following 
common practice, we assume that the arrival rate function of each Poisson 
process can be well approximated as being piecewise constant over short 
time intervals. Let Y = (yij) be an n x m matrix that records the arrival 
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counts from n such processes with each process being aggregated into m time 
intervals. Furthermore, we assume that yij is a Poisson random variable with 
rate Aj^. For notational purpose, we let A = (Ajj) denote the n x m hidden 
Poisson rate matrix. The ith row of A, denoted as A(•j^ = (Aji, . . . , Xim), is the 
rate profile of the ith process. Correspondingly, the ith row of Y, denoted 
as y^) = {vii, ■ ■ ■ ,yim)i is the count profile of the ith process. Note that the 
rows of A and Y are time ordered. 

The rate profiles {Xu\}'^^^ then form a vector- valued time series taking 
values in M™. It is of interest to forecast future rate profiles X{^n+h) (h > 0). 
However, the difficulty is that the rate profiles are unobservable. The idea is 
to build a time series forecasting model on the corresponding count profiles 
{y{«)}r=i' which can then be used to forecast X(^n+h)- 

In practice, the dimensionality of the vector time series {y(j)} is usually 
so large that it is infeasible to directly apply the classical vector autoregres- 
sive and moving average (VARMA) models [Reinsel (1997)]. For example, 
the call center application in Section 4 has m = 68, and the application of 
Weinberg, Brown and Stroud (2007) has m = 169. This calls for the neces- 
sity of dimension reduction. In addition, each y^j) is a vector of Poisson 
random variables with a positive rate vector A/j) . The Poisson nature of the 
data needs to be accounted for appropriately. 

For dimension reduction of Poisson variables, we consider the following 
i^-factor model, 



(2.1) 



■y(i) ~Poisson(A(j)), i = l,...,n, 

ff(A(i)) = /3jifi + • • • + PiK^K = F/3(i), 

where /3(j) = {Pn, ■ ■ ■ ,PiK)'^ is the K-vector of underlying factor scores for 
the ith rate profile, FmxK = (fi, • • • , fx) contains the K factor loading vec- 
tors in W^, and the transformation g is a link function suitable for Poisson 
variables such as the logarithmic or square root function. Here and through- 
out the paper, application of the link function g to a vector or matrix is un- 
derstood as a componentwise operation. See McCullagh and Nelder (1989) 
and Dobson (2001) for more discussion on generalized linear models (GLM) 
and, in particular, specification of link functions. 

Denote the factor matrix as B^xX = (/3(i)i • • • )/3(„,))^- Then the factor 
model for the transformed rate profiles can be written in the following matrix 
form: 

, . /Y~Poisson(A), 

^^•^^ U(A) = BF^. 

The factor model summarizes the transformed rate profiles using K com- 
mon factors, denoted as /3|, . . . ,/3x' which are the columns of B. Note that 
each /3^ is a time series of factor scores. How to build time series models on 
these series for forecasting will be discussed in Section 3. In practice, one 
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would assume K is usually much smaller than n or m, resulting in a consid- 
erable amount of dimension reduction. The above model (2.2) may appear 
similar to dynamic factor models (DFM) commonly used in econometrics 
[Stock and Watson (2005)]. However, there are two major differences. First, 
our loadings for the same process are time dependent and correlated, a useful 
feature for within-process updating discussed later in Section 3.2, while such 
dependence is missing for DFM because economic variables usually have no 
natural ordering. Second, our model deals with Poisson variables, which to 
the best of our knowledge have not been analyzed in the DFM literature 

As it stands, the factor model (2.2) is not identifiable. To achieve identifi- 
ability, one can require either fjffc' = 6kk' or fij^fiy = 6kk', where 6kk' is the 
Kronecker delta that equals 1 for k = k' and otherwise. See Section 2.2 for 
more details. 

2.2. An alternating maximum likelihood algorithm. Assuming indepen- 
dence of data, one can estimate the factors B and the loadings F using 
maximum likelihood. However, since both B and F are unknown, direct 
maximization of the likelihood function is complicated. Below, we propose 
an iterative procedure that relies on alternating optimization over B and F. 

Fixing F, the factor model (2.1) represents n standard Poisson regression 
models with the count profiles y(j) as responses and F as the common matrix 
of covariates. On the other hand, the same factor model can be written in 
terms of the columns of Y (denotes by {yj}) as 

f yj~ Poisson (Aj), j = l,...,m, 
^ ■ ' U(Aj) = B%). 

Fixing B, the model consists of m Poisson regressions with B as the common 
regressor matrix. 

The above discussion suggests the following iterative algorithm: 

1. Initialize: Apply singular value decomposition (SVD) to ^(Y) to obtain 
the SVD components {U,V, S}, and set Bold = [si'^i, ■ ■ ■ ,sk'^k] and 
Fold = [vi , . . . , vx] , where u^ is the kth column of the left singular vector 
matrix U, Vfc is the A;th column of the right singular vector matrix V 
and Sfc is the fcth diagonal element of the ordered diagonal singular value 
matrix S. 

2. Update: 

(a) Fit the Poisson regression model (2.1) with F replaced by Fold, and 
obtain an updated factor score matrix Bnow! 

(b) Fit the Poisson regression model (2.3) with B replaced by Bncw) and 
obtain an updated factor loading matrix Fnew! 

(c) Apply SVD to the product matrix BnewF^g^, and denote the ob- 
tained SVD components by {Uncw, Vncwi Sncw}- 
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3. Repeat Step 2 with Bold and Fold replaced by the first K columns of 
UnewSnew and Vncw respectively until convergence. 

The algorithm is presented in such a way that the factor loadings are 
orthonormal in that F-^F = /. Alternatively, we can modify the algorithm 
to make the score vectors satisfy B B = /. To achieve that, we can set B to 
be the first K columns of the left singular vector matrix U, and F to be the 
corresponding columns of SV. This alternative formulation turns out to be 
more useful for the penalized dynamic updating approach to be discussed 
in Section 3.2. 

Our Poisson factor model extends singular value decomposition (SVD) to 
Poisson count data. The above alternating maximum likelihood algorithm is 
an extension of the alternating least squares algorithm for computing SVD 
[Gabriel and Zamir (1979)]. 

2.3. Selection of the number of factors. In practice, one needs to decide 
on K, the number of underlying factors. Since we are in a forecasting con- 
text, one possibility is to compare out-of-sample forecasting performance 
for different choices of K. Note that the forecasting performance measure 
can only be calculated using the counts, as the rates are not observed. Our 
experience suggests that increasing K tends to improve forecasting perfor- 
mance, but the improvement becomes minimal beyond a particular value of 
K, denoted as K* , as additional factors start to fit the noise in the data. 
This K* can be used as the "true" number of factors. 

Alternatively, one can extend the idea of scree plots from principal com- 
ponents analysis (PCA) to the current context. In PCA, the eigenvalues of 
the sample covariance matrix are ordered decreasingly and plotted against 
their indices in a scree plot. An "elbow" in the scree plot, at which the slopes 
of lines joining the plotted points change from "steep" to "shallow," is usu- 
ally used to locate the number of significant principal components (PCs). 
To extend scree plots to Poisson variables, note that the eigenvalues equal 
to the reductions of residual sum of squares in reconstruction of the data 
using PCs as more PCs are added. 

In GLM, a commonly used goodness-of-fit measure is the so called de- 
viance [Dobson (2001)]. Consider the i^-factor model (2.1), and let B and 
F denote the maximum likelihood estimates of the parameters. Denote the 
corresponding maximized log-likelihood as 

/(Y;A = A) with A = 5"^ (BF^). 

In addition, consider the saturated model where the rate matrix A is esti- 
mated as the count matrix Y, and denote the corresponding log-likelihood 
as l(Y;A = Y). The deviance of the model (2.1) is then defined as 

2{/(Y; A = Y) - 1{Y; A = A)}. 
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Similar to the residual sum of squares, the deviance decreases as K increases. 
We propose to calculate the reduction of deviance of the model (2.1) for an 
increasing sequence of K, and plot it against K in a deviance reduction plot. 
The number of necessary factors can then be suggested by an "elbow" in the 
plot. See Figure 1 for an illustration of this idea, which shows that the steep 
drop of the deviance slows down significantly beyond the elbow around four 
or five factors. 

3. Forecasting. 

3.1. Forecasting future rate profile. Given the historical count data ym, 
i= l,...,n, and assuming the latent Poisson rates satisfy (2.1), consider 
forecasting the future rate profile \n+h){^ > 0). According to (2.1), 

\n+h) = 9~ {Pn+h,lil H 1" Pn+h,K^K)- 

Since the factor loading vectors fi,...,fft: can be obtained from historical 
data using the alternating maximum likelihood (AML) algorithm, forecast- 
ing the m-dimensional rate profile Xtn+h) reduces to forecasting the K- 
dimensional vector of factor scores jSin+h) — {Pn+h,i^ • ■ • > /9n+/i,ft:}^- Suppose 
we can obtain such a forecast l3j^_^f^-s, then a forecast of X(^n+h) follows as 

\TS _„-!/' /qTS f , I /5TS f N 

^(n+h) — 9 [Pn+h,l^l H r Pn+h,K^K), 

where P^+hk ^^ ^ time series forecast of Pn+h,k, I < k < K. Because the 
count profile y(„+/,) has a Poisson distribution with rate A(„+/j), the point 
forecast of y(.„+h) is the same as ^Jn+h) ■ 

In addition to the factors, the AML algorithm also produces the matrix 
B of factor scores, whose row vectors /3n), . . . ,/3(,„) form a ET-dimension 
time series. This vector time series is subject to time series modeling to 
generate desired forecast fSj^.fj^) ■ Note that the dimension K of this vector 
time series is much smaller than the dimension m of the original observa- 
tion 2/(j) . However, we propose to model the score time series for each factor 
separately using univariate time series models such as exponential smooth- 
ing, ARIMA models [Box, Jenkins and Reinsel (1997)] or state space models 
[Harvey (1990)]. An appropriate model can be decided on by analyzing his- 
torical data. The rational for this separate univariate time series modeling 
lies in the alternating estimation algorithm (Section 2.2). We observe that 
the score series Pj. is orthogonal to score series P'j^ for k ^k' . This lack of 
contemporaneous correlation suggests that the cross-correlations at nonzero 
lags are likely to be small. Hence, it suffices to forecast each fi^ separately. 

To generate interval and distributional forecasts, we modify the bootstrap 
approach described in Shen and Huang (2008). For a fixed k, we use the 
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fitted time series model for tlie series P^^ recursively, and bootstrap the 
model errors from the fitted model to generate a sample of B forecasts 
{l^n+h k}^^ — ^ — ^- Then B forecasts of X(^n+h) can be obtained as 

^^!ih)=9-\p:itji+---+p:iiKfK), b=i,...,B, 

which provides a distributional forecast for X^^+h) ■ To obtain a distributional 
forecast of the count profile, for any b, we randomly sample one count profile 

^TS b ■ '^TSjfe 

forecast y /^I^n from Poisson distributions with rate X^^+h) ■ The sample of B 
forecasts of ytn+h) then leads to its distributional forecast. Interval forecasts 
can be obtained easily using quantiles of the distributional forecasts. 

3.2. Dynamic within-process rate updating. In addition to the historical 
count profiles y(i) , . . . , y(„) , sometimes one also observes the counts of the 
(n + l)th process during the first ttiq time intervals. It is of interest to use 

them to update Xin+h)^ the time series forecast obtained previously. Such up- 
dating can reduce forecasting errors substantially [Weinberg, Brown and Stroud 
(2007), Shen and Huang (2008)]. For call centers, the ability to change agent 
schedules in response to the updating then yields operational benefits. For 
example, to adjust for a revised forecast, call center managers may send 
people home early or have them perform alternative tasks, or they may call 
in part-time or work-from-home agents. 

Denote the new observations from the {n + l)th process collectively as 
yfn+i) ^^'^ t^^ corresponding Poisson rate vector as Af^_(_j^\. Let yL_|_i\ = 
(?/n+i,mo+i) • • • ) yn+i,m)'^ be the latter count profile for the {n + l)th process 
with the rate vector being A/ _^^y For notational simplicity, we suppress 
the dependence of y(„^;^)/A(„_^^) and y|„^;^)/A[„_^-^) on mo in the following 
discussion. There are two sets of information, {y(i), ■ . • ,y(n)} and y'/^_^_iy 
We are interested in making use of both to obtain an updated forecast for 

a' 

-^(n+l)- 

Using historical data, the alternating maximum likelihood algorithm in 
Section 2.2 can be run to get estimates of the factor loading vectors fi, . . . , fi^: 
and the factor score series /3^ , . . . , /3^ . Time series models are then built on 
/3i, . . . , Px to generate forecasts /3^+i i, . . . , Pn+i K- Therefore, just using the 
historical count profiles, the time series forecast of A(„_,_]^\ is given by 

(3.1) A^^^ = g-\i3ll,^A + ■■■ + /3^|i,^4), 

where f^ is the latter segment of ffc after the initial mo time intervals. How- 
ever, this forecast does not utilize any new information contained in y^+i- 
The new information can be incorporated using a Poisson regression as 
we describe now. Suppose the factor model (2.1) holds for the rate profile 
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A(„_i_i) , then we have the following expression, 

(3.2) g{Xn+lj) = Pn+l,lfjl^ \- f3n+l,KfjK, j = 1, 



,m. 



Let F^ be an mo x K matrix whose {j,k)th entry is fjk, 1 < J < thq, 
I <k < K, and let /3(„+i) denote the vector (/3n+i,i, • • • ,f3n+i,K)'^ of factor 
scores. The following Poisson regression model can then be built on yf^,^-., 

.3 3^ f yfn+i) ~ Poisson(A^„+i)), 

This suggests that we can forecast P/^_^_i\ by maximum likelihood (ML), 
solving the following optimization problem. 



13. 



mo 

mm ^{A„+ij - yn+i,j log(A„+ij)} subject to g{XL_^i)) = F''/3( 



which minimizes the negative log-likelihood function. Note that the regressor 
F*^ in the Poisson regression is obtained using historical data. 

However, the above Poisson regression approach ignores the time series 
dependence among the rate profiles, present in the factor score series. To im- 
prove the ML forecast, we propose to combine it with the time series forecast 
of /3(„_|_i) using a penalized Poisson regression. Specifically, we minimize the 
following penalized likelihood criterion with respect to /3(„^;^) , 



mo 



TS ||2 



(3.4) 



J2{^n+i,j - yn+i,j log(A„+ij)} + a;||/3(„+i) - /3jf+i) 



subject to g{X1n+i) ) = F^/3(„+i) , 



where fSf^.D is a time series forecast based on the historical count pro- 
files, and a; > is a penalty parameter. To simplify our procedure, only one 
penalty parameter is used. Thus, it is desirable for the K time series {/3fc} to 
be roughly on the same scale. This can be achieved by requiring them to be 
orthonormal in the alternating maximum likelihood algorithm (Section 2.2). 

The penalized criterion (3.4) involves two terms: the first term measures 
the goodness-of-fit of the model to yL 1 1) in terms of likelihood, while the 
second term penalizes a large departure from the time series forecast. Its 
solution is a compromise between the two terms based on the size of to, the 
penalty parameter. In practice, uj can be selected based on the forecasting 
performance on a rolling hold-out sample [Shen and Huang (2008)]. 

Below in Section 3.3, we describe an iterative re-weighted least squares 
(IRLS) algorithm for minimizing (3.4) based on a quadratic approximation 



FORECASTING INHOMOGENEOUS POISSON PROGESSES 11 

of the objective function. The minimizer then gives us the penahzed maxi- 
mum hkehhood (PML) forecast of /3(„+i) • The PML forecast of \(n^i) (and 

y(„+i)) is then given by 

(-3 t;^ il,PML _ 1 , APML fl, , APML fl x 

For distributional updates of X/^_^_l^ , we propose to use a bootstrap pro- 
cedure similar to the one aforementioned in Section 3.1. First, by bootstrap- 
ping the errors from the time series forecasting models for /3(„+i) , we obtain 

B time series forecasts d, ' ^ , which in turn lead to B PML forecasts 3, , A 
by minimizing the criterion (3.4). The derivation of the B PML forecasts can 
be sped up considerably via a one-step updating approximation as discussed 



at the end of Section 3.3. The B PML updates of ^(n+i) are constructed as 



follows: 

\PML,fe_ -lfpPML,bj: , , pPML,b f \ 

\+l,j - 9 {Pn+1,1 Jjl-^ ^ Pn+l,KJjK), 

j = mo + l,...,m;b = l,...,B. 

The interval and density forecasts of ^n+i,j are obtained using the empirical 
distribution of A„^^ '• , b = 1,. . . ,B. To obtain a distributional update of 
the count profile, for 6 = 1, . . . , S, we randomly sample one forecast y„+ij- 

of Vn+ij from the Poisson distribution with rate X^+i j ■ "^^^ sample of B 
forecasts of Hn+ij then leads to its interval and density forecasts. 

3.3. An IRLS algorithm for penalized Poisson regression. Suppose we 
have some initial estimate of /9(„+i) denoted as /S^^+i) , and the correspond- 
ing estimate of \n+i) denoted as Xf^+i) • "^^^ minimizing criterion (3.4) is 
equivalent to 

mo 



(3.6) 



C(/3(„+i)) = ^{A„+ij - yn+ijlog(A„+ij)} 

K 
k=l 



Introducing A/^_,_j^\, the criterion (3.6) can be re-written as 



mo 



- yn+l,j{log{Xn+l,j) - log(A°+i •)}] 
(3.7) 
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mo 



+ Ei^n+l,i - Vn+lJ log(AO+i^j.)} 






+ a;f](/3„+i,fe-/3^ 



where A„+ij = g Hf|^/3(„+i)) and A°+^^- = 5 HfJ^/^Jn+i))- Dropping the 
sub/super-scripts, the summand in the first term of (3.7) is 

(3.8) (A-AO)-y{log(A)-log(AO)}, 

where A = 5~^(f^/3) and A*^ = g~^{i^p^). It follows from a second-order 
Taylor expansion that 

(A - AO) - y{log(A) - log(AO)} = w{y, f^/30){f^/3 - y*{y, f^/30)}2 + c(/30), 

where c(/3 ) is a constant given the initial estimate /3 , while the specific 
expressions of the functions w[-,-) and y*{-,-) depend on the link function 

9- 

The exact expressions oi w{-,-) and y*{-,-) for several commonly used link 
functions are listed below. For the identity link, 

for the logarithmic link, 

t.(y,f^/30) = if^/3°, y*(y,f^/3°) = f^/3° - (1 - ye^^"^"); 
and for the square-root link, 

«;fy f^/3°) = 1 + ^ 7y*(T/ f^/3°) = f^/3° - (^^^°^^ ~ ^^^^° 

Note that, for all three links, the induced response variable y* is a shifted 
version of f-^/3 . 

Plugging back the sub/super-scripts, barring any additive constant inde- 
pendent of j3inj^i\ , the quadratic approximation of the minimizing criterion 
C{l3ij^j^i\) is equivalent to 

mo 

^ u;(y„+i,„ ff /3°„+i)){ff /3(„+i) - 2/*(y„+i,,-, ff /30„+i))}2 

K 

+ '^^{Pn+l,k - fin+l,k) ■ 
k=l 
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In matrix form, the above quadratic approximation can be written as 

C(/3(n+i)) ^ (y(n+i) - F^/3(„+i))^W(y^„+i) - F^/3(„+i)) 



+ ^{(^(n+l) - /3(„+l)) (/3(n+l) - P 



(n+l)J 
eT aO 



jth. entry, while the diagonal weighting matrix W has w{yn+i,j,ij'^ Pfn+i)) 
as the jth diagonal entry. Hence, the minimizer of quadratic approximation 



where the induced response vector yL^^i) has y*(y„+ij,f? /3V^^-^^) as the 

jth entry, whil( 
as the jth diagc 
of C(/3(„+i)) is 

(3.9) ^(„+i) = (F^^WF-^ + a;I)-HF^^Wy^„+,) +^^TS+i)). 

The above derivation suggests the following iterative re-weighted least 
squares (IRLS) algorithm to solve the minimization problem (3.4): 

1. Initialize: Fit the Poisson regression model (3.3) on yL 1 1) and use the 
maximum likelihood estimate of P/n+i) as ^1^+1) ■ 

2. Update: Calculate the updated estimate /3(n+i) according to (3.9) for a 
particular time series forecast (3j^_^-^y 

3. Repeat Step 2 with /3(„_|_i) replaced by newly updated /9(n+i) until con- 
vergence. 

When using the above IRLS algorithm to derive a distributional forecast 
for /3(„+i), the algorithm needs to be iterated until convergence for each 

bootstrapped time series forecast (3,'-^^^ . In our implementation, we initial- 
ize the algorithm by setting f^^^+i) *° ^^ ^^^ penalized point forecast /3(„_|_i) , 
and iterate the algorithm one step toward the bootstrapped forecast ^/^li\ ■ 
This simplification helps reduce the computational effort and works well in 
our examples. 

4. Application to call center data. In this section we illustrate the pro- 
posed methods using the call center data analyzed in Shen and Huang (2008). 
For efficient staffing of the call center, we are interested in forecasting future 
daily arrival rate profiles (Section 4.3), and dynamically updating existing 
forecasts using new information within a day (Section 4.4). Shen and Huang 
(2008) focused on forecasting future call volumes instead of rates. In addi- 
tion to extending their work to deal with Poisson rate forecasting, we go 
one step beyond comparing various rate forecasts — the forecasted rates are 
used to calculate the required staffing level, that is, the number of agents 
necessary to staff the call center in order to achieve some service level con- 
straint (Sections 4.3.3 and 4.4). The effect of rate forecasting and updating 
on determining the staffing level has not been considered before. 
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The call center data record the incoming call volumes to a northeastern 
US bank call center during every quarter hour within the normal business 
hour (7:00AM-midnight). The data cover 210 weekdays between January 
6th and October 24th, 2003. Among them, ten abnormal days are excluded 
from the analysis, including six holidays where the call volumes are very 
low and four days where the data are missing [Shen and Huang (2008)]. 
Empirical research has recently suggested that it is appropriate to model 
the arrival process of customer calls to a call center as an inhomogeneous 
Poisson process [Brown et al. (2005)]. 

This section is organized as follows. We first provide some background on 
call center agent staffing in Section 4.1 to facilitate our later calculation of 
staffing levels. We fit the factor model (2.1) in Section 4.2 and develop time 
series models for the factor score series. In Section 4.3 simulation studies 
are first performed to investigate the rate forecasting performance of our 
methods. The effect of interday rate forecasting on staffing level is then 
illustrated using the real data. Various intraday updating methods are then 
compared in Section 4.4 in terms of the related staffing level forecasts. 

4.1. Background on call center agent staffing. Service systems such as 
call centers are usually analyzed as queueing systems. Queueing theoretic 
models, such as Erlang-C (or M/M/N queue), are used to balance agent 
utilization and quality of service (QoS) according to some pre-specified QoS 
measure [Cans, Koole and Mandelbaum (2003)]. The capacity planning and 
workforce scheduling process in such systems usually takes place in three 
steps. First, historical arrival data are used to generate forecasts of arrival 
rates in each time period over some planning horizon. Second, queueing 
models are used to determine staffing levels for each of the time periods. 
For example, one simplistic yet previously commonly-used staffing strategy 
is the 80-20 rule, which finds the minimum number of agents to staff a 
system such that "80% of the customers will be served while their waiting 
(or delay) time is less than 20 seconds." Finally, the staffing level obtained 
in the second step is used as inputs to some mathematical programming to 
generate a set of agent schedules. 

Below we focus on the second step of deciding the staffing level, and illus- 
trate how improved rate forecasting, especially dynamic intraday updating, 
can improve the staffing accuracy. Suppose the arrival rate during a time 
period is A, and an agent can handle // calls per unit of time. Then, the 
offered load is defined as R = A//U. Intuitively, this gives the minimum num- 
ber of agents that a call center needs in order for the system to be stable. 
Otherwise, more calls will arrive than the agents can process, and the queue 
will eventually explode. 

One well-known heavy traffic queueing-theoretic result [Halfin and Whitt 
(1981)] is that for highly utilized moderate to large systems such as the call 
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center we analyze here, the required staffing level N is approximately equal 
to 



(4.1) 



N = R + eVR, 



where 6 > is a service level parameter that is decided by the steady-state 
probability of calls being delayed, a. To be exact, 

(4.2) a- 



^ , 0H0)Y 



where $ and (j) are the standard normal distribution and density functions. 
For example, 6 = 1 corresponds to 22% of the calls being delayed in steady 
state. 

This simple staffing rule is known as the square-root safety staffing princi- 
ple, where 9\'U is the safety staffing needed above the minimum requirement 
R in order to achieve some desired service level. The reason that the safety 
staffing is in the order of the square-root of the offered load is nontriv- 
ial. Recently, Garnett, Mandelbaum and Reiman (2002) extend the above 
result to queueing systems with abandonment by allowing the service level 
parameter 9 to be negative. In call centers, some impatient customers end up 
abandoning the tele-queue before getting served. Square-root safety staffing 
is awfully simple. However, in order for it to work well in practice, one relies 
on accurate prediction of future offered load R, in particular, the arrival rate 
A, which is the focus of the current research. 



4.2. Model building. We considered A' = 1, ..., 10 and fit the correspond- 
ing factor model (2.1) to the call center data. The square-root link function 
was employed. The left panel of Figure 1 gives the deviance reduction plot, 
which suggests that five factors are sufficient as the reduction of the de- 
viance becomes flat afterward. The right panel plots the first five factor 
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The deviance reduction plot, suggesting five significant factors whose loadings 
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Fig. 2. Time series plot of the first score series and its lag-one scatter plot, suggesting 
an AR(1) time series model with a day-of-the-week dependent slope. 



loading vectors, which are normahzed to achieve comparable scales. The 
first loading vector roughly summarizes the average daily rate profile. The 
additional loading vectors capture various contrasts among different time pe- 
riods within a day. The findings are consistent with those in Shen and Huang 
(2008) except theirs are about the count (instead of rate) profiles. 

To perform interday forecasting and intraday updating, we first need to 
develop a forecasting model for each score time series. The left panel of 
Figure 2 plots the first score series /3^. Different symbols indicate different 
weekdays, revealing a strong day-of-the-week effect. The right panel shows 
that the factor scores of two consecutive weekdays are linearly related, con- 
ditioning on day-of-the-week. This motivated us to consider the following 
varying-coefficient AR(1) model, 

Ai =ai(di_i) + &i/3i_i,i +eji, 

where dj_i denotes the day-of-the-week of day i — 1, and the varying in- 
tercept ai depends on di-i. A more complicated model was also considered 
where the slope is allowed to depend on day-of-the-week. However, an F-test 
for the nested models returns a p- value of 0.7724, which suggests that the 
larger model does not provide any significant improvement. 

Further exploratory data analysis suggested similar varying intercept AR(1) 
models for the other score series. Hence, we opt to use such models to fore- 
cast the future scores, which are denoted as 



(4.3) (3ik — ak{di-i) + bkPi-i,k + £ik, 



, ,n,k = 1, 



,5. 
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4.3. One- day- ahead rate forecasting. We first describe in Section 4.3.1 
two useful parametric models motivated by the call center data. The models 
are used in Section 4.3.2 to generate data in simulation studies to test our 
inter day forecasting methods. By considering the effect of forecasting on 
staffing level, our forecasting methods will be further tested in Section 4.3.3 
using the real data. For our methods, we consider dimension reduction by the 
factor model (2.1) with K = 1,. . . ,5 underlying factors respectively, followed 
by time series modeling with (4.3) on each series of factor scores. We shall 
denote our methods as TSl to TS5 depending on the number of factors used. 

4.3.1. Two parametric models. We consider two parametric models, where 
the data yij ~ Poisson(Ajj), i = 1, . . . ,n, j = 1, . . . ,m, and the underlying 
Poisson rate is assumed to have a two-way structure characterizing re- 
spectively the day-by-day variation and time-of-day variation. These two 
kinds of variations are typical for call center data [Brown et al. (2005), 
Weinberg, Brown and Stroud (2007)]. By taking into count the Poisson na- 
ture of the observations, the two models extend the corresponding models 
used in Shen and Huang (2008), which instead assume the square-root trans- 
formed counts as Gaussian observations. 

A multiplicative model. The multiplicative (MUL) model assumes that the 
Poisson rates satisfy 



/%d = aad^j, dj = 1,2,3,4, 5, 

(AA-\ , ai-ad^ = b{ai-i-adi_-,)+r]i, ry^ ~ AA(0,(/)2), 

j 

Under this model, the square-root of the rate has a two-way multiplicative 
structure: the day-to-day variation (oj) is auto-regressive of order one with 
mean depending on day-of-the-week (dj); the time-of-day variation {'jdij) 
also depends on day-of-the-week. This model can be viewed as 
a non-Bayesian version of the Bayesian model proposed by 
Weinberg, Brown and Stroud (2007), except that they also require the in- 
traday variation jdij to be smooth. 

An additive model. The additive (ADD) model assumes that the Poisson 
rates satisfy 



(4.5) 



\ij = ^ + ai + (3j + -fdd , di = 1,2,3,4, 5, 

-ad, =b{ai-i -ad,^^) + r]i, r/j ~ AA(O,0^), 
Y^ai = Y,Pj = Yl ^dd = Y. ^dd = Y. ^dd = 0. 



Under this model, the square-root-transformed rate has a two-way additive 
structure: the day-to-day variation follows an AR(1) process adjusting for 
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day-of-the-week (arf.) effect; it also has an interaction with the time-of-day 
variation (7d,j). 



As mentioned earher, when yij ~ Poisson(Ajj), Xy = Jyij + 1/4 is ap- 
proximately M{y^Xij, 1/4). Hence, the estimation and forecasting methods 
described in Sections A. 3 and A. 4 of Shen and Huang (2008) are applicable 
for the above two models. 

Both Weinberg, Brown and Stroud (2007) and Shen and Huang (2008) 
considered an industry standard as their benchmark, which uses historical 
averages of the same day-of-the-week as forecasts. They both showed that 
the approach works poorly; hence, we choose not to consider it in our studies. 

4.3.2. Simulation studies of rate forecasting performance. Because the 
Poisson rate profiles are unobservable in practice, in order to evaluate the 
rate forecasting performance of our methods, we carry out two simulation 
studies, using the multiplicative model (4.4) and the additive model (4.5) 
respectively to generate data. The model parameters are decided on by fit- 
ting the models to the real data described at the beginning of Section 4 and 
rounding the parameter estimates properly. We generate 100 data sets from 
each model for use in our simulation studies. The simulation results (de- 
tailed below) show that, without a specific parametric model assumption, 
our approach can always perform almost as well as the method using the 
true data generating model. As a comparison, a method based on a specific 
model assumption performs worse if the model assumption is wrong. 

Seven forecasting methods are considered in our simulation studies, in- 
cluding forecasting assuming the multiplicative model (4.4), forecasting as- 
suming the additive model (4.5), and our methods TSl to TS5. For all 
methods, the true square-root link function is used. For each data set, a 
rolling out-of-sample forecast exercise is performed with the last 50 days as 
the forecasting set; and for each day in the forecasting set, the preceding 
150 days are used to derive the forecast. The parameters involved in the 
methods are re-estimated for each day in the forecasting set. 

To compare the performance of forecasting the underlying rate, two per- 
formance measures are calculated, the root mean squared error (RMSE) and 
mean relative error (MRE). For day i in each data set, we define 



RMSE, 



22(^^J->^^Jr and MRE, = :^^I^^^ -"^^1 






where Ajj is the forecast for \ij. 



The Mean RMSE and Mean MRE (%) of the forecasted rates are then 
calculated for each simulated data set. For each forecasting method, the 
two panels of Figure 3 plot the empirical cumulative distribution functions 
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Fig. 3. Comparison of empirical CDF of Mean RMSE for rate forecasting. The true 
model is highlighted in the legend using an asterisk. TS4 performs comparably to the true 
model, while the wrong model performs worse. 



(CDF) of the Mean RMSE calculated on the 100 data sets simulated using 
the multiplicative and additive models respectively. The true model is indi- 
cated in the legend using an asterisk. When the data are generated using the 
multiplicative model, the additive model is a misspecified model and vice 
versa. In both cases, our TS methods do not use the model assumptions on 
the form of the underlying Poisson rate. The improvement of TS5 over TS4 
is minimal; hence, the CDF for TS5 is omitted in both panels. 

Several observations can be made from the plots. For the TS methods, 
when increasing the number of factors K, the performance measure gets 
stochastically smaller; TS4 is competitively close to the true model; when 
K increases beyond five, the forecasting improvement remains minimal. This 
suggests that four underlying factors are sufficient to approximate the true 
models. In addition, the wrong parametric model results in stochastically 
inferior forecasts than TS4. The uniformly good performance of our method 
shows its robustness against model assumptions, which is important in prac- 
tice where the information of the true underlying model is hardly ever avail- 
able. The results for Mean MRE (%) are similar and are not shown. 



4.3.3. The effect of rate forecasting on staffing level. Below we apply the 
seven forecasting methods considered previously in Section 4.3.2 to the call 
center data, and illustrate how the rate forecasting affects staffing level. This 
important fundamental question has not been investigated previously in the 
literature. To calculate the staffing level, we apply the square-root safety 
staffing rule discussed in Section 4.1. 

We assume an average service time of 5 minutes per call, which corre- 
sponds to an agent service rate of 12 calls per hour, or 3 calls per 15-minute 
interval. This figure is chosen subjectively for illustration purposes based 
on the empirical analysis in Brown et al. (2005). Suppose the arrival rate is 
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Table 1 

Summary statistics (mean, m,edian, lower quartile Ql, upper quartile Q3) of RMSE and 

MRE of the forecasted staffing level in a rolling forecast exercise. The forecasting set 

contains 50 days. MUL, TS4 and TS5 perform comparable 







RMSE 






MRE 


(%) 






Ql 


Median 


Mean 


Q3 


Ql 


Median 


Mean 


Q3 


ADD 


14.28 


18.03 


20.31 


21.80 


4.63 


5.40 


6.29 


6.68 


MUL 


14.27 


16.65 


19.69 


20.38 


4.46 


5.06 


5.91 


6.30 


TSl 


14.91 


18.55 


20.63 


21.79 


4.96 


6.28 


7.07 


8.31 


TS2 


13.54 


17.51 


19.88 


20.89 


4.73 


5.41 


6.32 


6.88 


TS3 


14.74 


17.80 


19.77 


19.90 


4.63 


5.27 


6.12 


6.62 


TS4 


14.32 


16.55 


19.50 


20.27 


4.45 


5.06 


6.00 


6.42 


TS5 


13.98 


16.55 


19.46 


20.57 


4.44 


5.06 


5.98 


6.35 



forecasted to be \j for the jth interval. Then the offered load is Rj = Aj/3. 
To figure out the additional safety staffing, we assume that the call center 
expects a 22% delay probability in steady state, which then determines the 
safety staffing factor to be 1 according to (4.2). Hence, it follows that the 
required staffing level for interval j is 



A,/3 + yV3. 



We perform, on our data set, the same rolling one-day-ahead forecasting 
exercise as the one described in the simulation studies of Section 4.3.2. As 
the benchmark, we use the "oracle" staffing level decided by assuming the 
actual call volumes as the rate forecasts. Note that we wouldn't observe 
the actual volumes when actually planning the staffing in practice. We then 
treat the "oracle" staffing level as the truth when calculating performance 
measures of the forecasted staffing level decided by the various rate forecasts. 

Table 1 compares summary statistics of the RMSE and MRE of the 
staffing level forecasts from the seven methods. For the TS methods, the 
results from using the square-root link function are presented. In general, 
the forecasting accuracy improves as one increases K, the number of fac- 
tors. TS4 and TS5 give comparable results. On the ground of parsimony, 
TS4 may be preferred to TS5. The ADD model performs similar to TS2, 
while MUL similar to TS4/TS5. 

In practice, out-of-sample forecasting performance can be used to pick the 
link function. Our choice of the square-root link in Table 1 is primarily for 
comparison purpose, because both ADD and MUL models use the square- 
root link. In addition, we also tried the identity link and the logarithmic 
link for our methods in this example, which generated inferior results than 
the square-root link. 
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4.4. Dynamic intraday updating of rate and staffing level. The call cen- 
ter data are used to illustrate the benefit of dynamic updating. We focus 
on the 10:00AM updating and the 12:00PM updating, which means that 
intraday dynamic updating is performed at 10:00AM and 12:00PM, respec- 
tively. The benchmark is the TS4 method which performs no updating. We 
shall use the square-root link and K = 4 based on the forecasting results in 
Section 4.3.3. Correspondingly, we term our penalized maximum likelihood 
intraday updating approach as PML4. 

For comparison purpose, we also consider two alternative intraday up- 
dating approaches that combine the MUL/ADD forecasts with historical 
proportions (HP). Suppose the MUL point forecast for the rate profile of 
day n -|- 1 is A/^V'j') , which also gives the point forecast for the count profile 
y(„+i). For an updating point niQ-. 

• calculate the ratio R between the total number of calls arrived and the 
cumulative forecasted rate up to the time period ttt-o, 

Y^mo \MUL ' 

• update the MUL forecasts for the remainder of the day as 

K+i,j - -KA^+i J, J - mo + 1, . . . , m. 

We refer to the above updating method as HPM. One can replace the MUL 
forecast with the ADD forecast in the above algorithm, which leads to the 
HPA updating. The HPM and HPA updating methods given here are ex- 
tensions of similar methods for updating counts described in Section 4.2.2 
of Shen and Huang (2008). 

Below we apply the aforementioned updating methods to the call center 
data, and illustrate the effect of dynamic intraday updating on staffing call 
centers. The same rolling forecasting exercise as in Section 4.3.3 is performed, 
and we again use the square-root staffing rule to determine staffing level. 

For PML4, the value of the penalty parameter u; needs to be decided at 
each updating point. To this end, we consider a set of candidate values, and 
choose the one that minimizes some forecasting performance measure based 
on the call volumes (rather than rates), such as RMSE. Specifically, we use 
the beginning 150 days in our data set as the training set, and perform 
an out-of-sample rolling forecast exercise on this training set. Note that this 
training set does not overlap with the forecasting set we initially hold out for 
out-of-sample forecast evaluation. For the purpose of selecting ui, the last one 
third (i.e., 50 days) of the training set is used as a hold-out sample. For each 
day in this sample, its preceding 100 days are used to fit the Poisson factor 
model and to generate the PML4 updating for each given uj. Compute some 
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Table 2 

Summary statistics (mean, median, lower quartile Ql and upper quartile Q3) of RMSE 

of the updated staffing level for the 10:00AM and 12:00PM updatmgs. PML4 outperforms 

the other methods 



10:00AM updating 



12:00PM updating 





Ql 


Median 


Mean 


Q3 


Ql 


Median 


Mean 


Q3 


TS4 


12.42 


14.29 


18.03 


18.82 


12.42 


14.29 


18.03 


18.82 


HPA 


12.80 


17.04 


19.33 


21.46 


12.20 


13.23 


16.67 


16.33 


HPM 


12.71 


15.49 


19.02 


21.03 


11.98 


13.04 


16.17 


16.02 


PML4 


12.62 


14.81 


17.19 


18.10 


11.46 


12.87 


15.41 


14.54 



performance measure (e.g., RMSE) for every day in the hold-out sample and 
take the average. The uj that minimizes this average performance measure 
will be used for all days in the forecasting set. In this study, we consider uj 
from {0, 10, ... , 10^}, and to = 10^ is chosen for both updatings. 

Table 2 presents summary statistics of the RMSE of the forecasted staffing 
levels from TS4, HPA, HPM and PML4. The averages are calculated over 
the 50 days in the forecasting set. For a fair comparison, only data after 
12:00PM are used when calculating the RMSE. The superior performance 
of PML4 over the other methods is quite clear, which improves over TS4 
by 14.5% on average. We also observe that, for every intraday updating 
method, updating later always improves the forecasting accuracy. In addi- 
tion, intraday updating reduces the prediction interval width as illustrated 
in the right panel of Figure 4. 

Figure 4 provides a graphical illustration of the forecasted staffing lev- 
els for September 2nd, 2003. The left panel shows the required number of 
agents for every 15-minute interval between 10:00AM and 17:00PM, when 
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Fig. 4. Comparison between the required number of agents per 15-minute interval for 
September 2nd based on the forecasted rates from various methods. The PML4 10:00AM 
updating gives the most accurate staffing. 
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the majority of the caUs get connected to the center. We observe that the 
10:00AM updating results in a dramatic upward shift in the related staffing 
level, which is very close to the oracle staffing level. On the other hand, the 
TS4 forecast leads to uniformly under-staffing throughout the day. What 
happens is that September 2nd corresponds to the day after Labor Day. 
Consequently, the call center experienced an unusably high call volume that 
day, which was not accounted for by the TS4 method. The HP updating 
methods can somehow adjust for the upward shift; however, their perfor- 
mance is worse than the PML4 method. 

In the right panel of Figure 4, we superimpose the 95% prediction inter- 
vals for the staffing level resulting from the corresponding intervals for the 
forecasted arrival rates. Only PML4 and TS4 are compared. The number of 
bootstrapped forecasts is chosen to be i? = 1000. We observe that the in- 
tervals from the intraday updating (PML4) almost always cover the "true" 
staffing level, while the intervals without the updating miss for more than 
half of the time periods. In addition, intraday updating results in uniformly 
narrower prediction intervals, which suggests that the prediction is more 
precise. The average interval width for TS4 and PML4 is 84.86 and 49.65, 
respectively; hence, a 41.5% average reduction results from the intraday up- 
dating. The managerial benefit is that the call center manager can work 
with a much narrower range of staffing level in the presence of arrival rate 
uncertainty. 

5. Conclusion and discussion. In this paper we develop methods for fore- 
casting and dynamic updating of the latent and uncertain rate profiles of a 
time series of inhomogeneous Poisson processes. Our new approach extends 
the model-driven approach of Weinberg, Brown and Stroud (2007) and the 
data-driven approach of Shen and Huang (2008). The latter does not explic- 
itly consider forecasting the latent Poisson rates. Instead, the authors con- 
sider forecasting of call volumes that can be regarded as a noisy version of 
the rates. We propose a Poisson factor model that combines the distribution 
assumption with data-driven dimension reduction. An iterative algorithm 
is proposed for parameter estimation. Besides showing the competitiveness 
and robustness of our forecasting methods, we also illustrate how intraday 
updating can improve the accuracy of call center staffing, which is the first 
in the relevant literature and provides some important managerial insights. 

In terms of model fitting, our approach involves two stages: first fit a Pois- 
son factor model for dimension reduction, then fit time series models on the 
factor score series. If the time series models could be specified together with 
the factor model prior to data analysis, a full likelihood approach could be 
statistically more efficient. However, the time series models are usually not 
easy to specify beforehand, so our two stage procedure is natural for model 
building. Moreover, numerical calculation for the full likelihood approach 
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is much more complicated than our procedure, although it is conceptually 
simple. In our call center application, fast computation is of great concern, 
especially when performing dynamic within-day updating. Nevertheless, it 
is still interesting to explore the full likelihood approach. Whether statis- 
tical efficiency of using full likelihood translates into benefits in accurate 
forecasting deserves further research. 
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